Lecture 11
Knowledge and Recognition
- in a feedforward NN, each neuron's knowledge is a set of weights and a bias
- recognition happens gradually as the neurons process the input, transforming it into the output
- each neuron does a very simple calculation (see the sketch below):
  - take the inner product of the inputs with the weights
  - add the bias
  - apply an activation function (e.g., the sigmoid)
- where do the weights and biases come from?
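A minimal sketch of that per-neuron computation in Python (the function names and example numbers are illustrative, not from the lecture):

```python
import math

def sigmoid(z):
    # squash any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # inner product of inputs and weights, plus the bias
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # apply the activation function
    return sigmoid(z)

# example: one neuron with three inputs
print(neuron([1.0, 0.5, -1.0], [0.2, -0.4, 0.1], 0.3))  # about 0.55
```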
Error-Based Learning
- deep learning neural networks learn through making mistakes
- initial weights and biases are randomly assigned before the learning process begins
- NN processes input data using these random weights and biases
- output is compared to the desired output to determine the error or loss.
- if the output matches the desired result, no changes are made to the weights and biases
- if there is a mismatch, indicating a mistake, NN adjusts the weights and biases to minimize the loss
- loss is calculated from the difference between the computed output and the desired output
- goal is to learn from mistakes and update weights and biases to reduce future errors
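A tiny sketch of the comparison step, assuming a single numeric output and the simple difference-style loss described above (the names and numbers are illustrative):

```python
def loss(computed, desired):
    # difference between what the network produced and what it should have produced;
    # a value near 0 means the network got this example (nearly) right
    return computed - desired

desired = 1.0    # the label for this input
computed = 0.73  # what the network produced with its current weights and biases
print(loss(computed, desired))  # -0.27: a mismatch, so the parameters need adjusting
```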
Error-Based Learning Process
- randomly select an input-output pair, and process it through the NN
- desired output is the original label
- computed output is the result of the NN
- measure the loss of the output
- if it's near 0, keep the parameters (w's and b's) as they are
- otherwise, improve the parameters
- repeat this process until computed and desired outputs consistently match
- loss is related to the difference between computed and desired outputs
- adjusting the parameters to minimize the loss is a gradual process that takes time
- learning good parameters (w's and b's) may require thousands of iterations or epochs
- focus on adjusting parameters to minimize loss, applying the psychology of learning from mistakes
- consider whether increasing or decreasing a parameter (e.g., w or b) improves or worsens the loss
- if increasing the parameter increases the loss, decrease it
- if increasing the parameter decreases the loss, increase it
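A rough sketch of that trial-and-error rule for a single parameter, assuming we can re-evaluate the loss after a small nudge (everything here, including the toy loss function, is illustrative):

```python
def adjust(param, loss_fn, step=0.01):
    # probe whether nudging the parameter upward helps or hurts the loss
    if loss_fn(param + step) > loss_fn(param):
        return param - step  # increasing made the loss worse, so decrease
    else:
        return param + step  # increasing made the loss better, so increase

# toy loss: smallest when w = 2.0
loss_fn = lambda w: (w - 2.0) ** 2

w = 0.0
for _ in range(500):  # many small corrections, not one big jump
    w = adjust(w, loss_fn)
print(w)  # ends up close to 2.0
```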
Stochastic Gradient Descent
- Old learning law: New Estimate = Current Estimate - Loss
- Problem: Learns in one fell swoop, unlike human learning which occurs gradually through small error corrections.
- Solution: Introduce a small learning rate (e.g., 0.1, i.e., 10 percent)
- New Learning Law: New Estimate = Current Estimate - 0.1*Loss
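A minimal sketch contrasting the two laws, assuming the loss here is the computed estimate minus the desired value, so that subtracting it pushes the estimate toward the target (the 0.1 learning rate is the lecture's example value; everything else is illustrative):

```python
def update_all_at_once(estimate, loss):
    # old law: correct the entire error in a single step
    return estimate - loss

def update_gradually(estimate, loss, rate=0.1):
    # new law: correct only a small fraction of the error each step
    return estimate - rate * loss

desired = 2.0
estimate = 5.0
print(update_all_at_once(estimate, estimate - desired))  # 2.0 in one fell swoop

for _ in range(50):
    loss = estimate - desired  # computed minus desired
    estimate = update_gradually(estimate, loss)
print(estimate)  # creeps toward 2.0 over many small steps
```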
